Jet substructure as a new Higgs search channel at the LHC 
Abstract. These proceedings discuss a possible new search strategy for a light Higgs boson at the LHC, in high-$p_t$ $WH$ and
$ZH$ production where the Higgs boson decays to a single collimated jet. Material is included that is complementary to what
was shown in the original article [arXiv:0802.2470].
The search for the Higgs boson is one of the main
priorities for the Tevatron and the LHC. Current
electroweak precision fits [1] suggest that the Higgs boson
may be light, i.e. with a mass $m_H \sim 120$ GeV. This region
of mass is one of the most difficult in which to discover
the Higgs boson, in part because its main decay channel,
to $b\bar b$, is swamped by large QCD backgrounds.
Accordingly, most search strategies rely either on looking for
rare but characteristic decay channels such as $H \to \gamma\gamma$, or
alternatively for $H \to b\bar b$ decays in production channels
with an associated, leptonically-decaying $W$ or $Z$ boson,
which provides an electroweak "tag" that is rare in
backgrounds.

While this second approach seems promising at the
Tevatron [2], studies from a few years ago [3] suggested
that it would be very challenging at the LHC. The
difficulty is clearly visible in fig. 1, which shows simulated
background (dashed line) and background+signal (solid
line) distributions. Not only is the ratio $S/\sqrt{B}$ rather
low, but the signal is a tiny addition to a background with
strong kinematical structure near the generated Higgs
mass, an artefact due to transverse momentum cuts in the
analysis and a significant $t\bar t$ background. Fig. 1 implies a
need for exquisite control of the background shape if the
Higgs boson is to be identified here.

Recently we suggested [4] that the $WH$ and $ZH$
channels might be recovered as potential discovery channels
by restricting one's attention to the $\sim 5\%$ of events in
which the vector and Higgs bosons each have a large
transverse momentum, $p_{tV}, p_{tH} > 200$ GeV, and are
back-to-back. As we shall discuss in more detail later,
this is advantageous (despite the large reduction in signal
cross-section) because it greatly increases the ratio of
signal to background, and largely eliminates the problem
of the non-trivial "shape" of the background.
FIGURE 1. Background (dashed line) plus Higgs signal
(solid line) for a Higgs boson with $m_H = 100$ GeV in the
$pp \to WH$, $W \to \ell\nu$ and $H \to b\bar b$ channel, as found in the
ATLAS TDR study [3] for 30 fb$^{-1}$.



To investigate the potential of such a high-$p_t$ Higgs
search, ref. [4] carried out a hadron-level analysis, which
can be factored into a leptonic side and a hadronic
side. The leptonic side is straightforward: electrons and
muons are considered to be identified if they have $p_t >
30$ GeV and $|\eta| < 2.5$, missing energy is considered if
$\not{E}_T > 30$ GeV, and one looks for events consistent with
$Z \to \ell^+\ell^-$, $Z \to \nu\bar\nu$ or $W \to \ell\nu$ and places a cut on the
total $p_t$ of the vector boson, $p_{tV} > p_{t,\min}$.
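As a rough illustration of the leptonic selection just described, the following Python sketch applies the quoted lepton and missing-energy cuts and then the cut on the vector-boson transverse momentum. It is a minimal sketch under stated assumptions: the function names, the `(pt, eta, phi)` input format and the assignment of channels purely by lepton multiplicity are hypothetical simplifications (a real analysis would also impose invariant-mass requirements, which are not given here), and the default `pt_min` of 200 GeV is only one of the values considered in the text.

```python
import math

PT_LEPTON_MIN = 30.0   # GeV, lepton identification cut quoted in the text
ETA_LEPTON_MAX = 2.5   # lepton pseudorapidity cut quoted in the text
MET_MIN = 30.0         # GeV, missing-energy cut quoted in the text

def identified(leptons):
    """Keep leptons, given as (pt, eta, phi), that pass the identification cuts."""
    return [lep for lep in leptons
            if lep[0] > PT_LEPTON_MIN and abs(lep[1]) < ETA_LEPTON_MAX]

def classify_leptonic_side(leptons, met, met_phi, pt_min=200.0):
    """Return (channel, pt_V), or None if the event fails the selection.

    Assumption: the channel is assigned from the lepton multiplicity only.
    """
    good = identified(leptons)
    has_met = met > MET_MIN
    # Transverse components of the summed identified leptons
    px = sum(pt * math.cos(phi) for pt, _, phi in good)
    py = sum(pt * math.sin(phi) for pt, _, phi in good)
    if len(good) == 2:
        channel = "Z->ll"
    elif len(good) == 1 and has_met:
        channel = "W->lnu"
        px += met * math.cos(met_phi)   # neutrino taken from missing energy
        py += met * math.sin(met_phi)
    elif len(good) == 0 and has_met:
        channel = "Z->nunu"
        px, py = met * math.cos(met_phi), met * math.sin(met_phi)
    else:
        return None
    pt_v = math.hypot(px, py)
    return (channel, pt_v) if pt_v > pt_min else None
```

For example, `classify_leptonic_side([(100.0, 1.0, 0.0)], 150.0, 0.0)` is treated as a $W \to \ell\nu$ candidate with $p_{tV} = 250$ GeV, while an event whose reconstructed $p_{tV}$ is below `pt_min` is rejected.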

The hadronic side requires more sophistication if one
is to maximise signal-to-background ratios and obtain
a good mass resolution. The high-$p_t$ Higgs decays to a
$b\bar b$ pair, which leads to a single broad jet. Until recently,
the state-of-the-art approach [5, 6, 7] for identifying such
decays exploited the hierarchical nature of the $k_t$
algorithm [8, 9]. This is effective in rejecting backgrounds
but suffers from poor mass resolution, while
split-merge-based cone algorithms often give better mass resolution
but with poor background rejection.

FIGURE 2. Mass drop and filtering procedure.

A powerful mix of the two approaches can be
constructed [4] using the Cambridge/Aachen algorithm [10, 11],
which successively recombines the closest pair of
particles (or pseudojets) in the event until all are
separated by a rapidity-azimuth distance of at least $R$. After
having clustered the event and identified a hard massive
jet, one undoes one step of its clustering. This breaks the
jet into two subjets: if the heavier subjet is significantly
lighter than the original jet and the sharing of momentum
between the two subjets is not too asymmetric,¹ then
one works with the hypothesis that the two subjets
correspond to the $b$ and $\bar b$ from the Higgs decay. Otherwise
one discards the lighter subjet and repeats the
unclustering procedure on the heavier subjet.
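The clustering and unclustering steps above can be sketched in a few lines of Python. This is an illustrative toy, not the implementation used in ref. [4]: it uses a brute-force pair search rather than an optimised jet finder, all function names are hypothetical, and the cut parameters `mu` and `ycut` (with defaults 0.67 and 0.09) together with the asymmetry variable $y = \min(p_{t1}^2, p_{t2}^2)\,\Delta R_{12}^2 / m_j^2$ are assumptions made here for definiteness, since the text only states that such cuts exist.

```python
import math

def massless(pt, y, phi):
    """Four-vector (E, px, py, pz) of a massless particle."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def p_t(p):
    return math.hypot(p[1], p[2])

def mass(p):
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))

def delta_r(p, q):
    """Rapidity-azimuth distance between two four-vectors."""
    y_p = 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))
    y_q = 0.5 * math.log((q[0] + q[3]) / (q[0] - q[3]))
    dphi = abs(math.atan2(p[2], p[1]) - math.atan2(q[2], q[1]))
    return math.hypot(y_p - y_q, min(dphi, 2 * math.pi - dphi))

def cambridge_aachen(particles, R):
    """Cluster into jets; each jet is a tree (four_vector, parent1, parent2)."""
    nodes = [(p, None, None) for p in particles]
    while len(nodes) > 1:
        # Find the closest pair in rapidity-azimuth distance
        d, i, j = min((delta_r(nodes[i][0], nodes[j][0]), i, j)
                      for i in range(len(nodes))
                      for j in range(i + 1, len(nodes)))
        if d >= R:          # all remaining objects are separated by at least R
            break
        merged = (add(nodes[i][0], nodes[j][0]), nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes

def mass_drop(jet, mu=0.67, ycut=0.09):
    """Undo clustering steps until a mass drop with symmetric splitting is found.

    Returns (jet, subjet1, subjet2) or None. mu and ycut are assumed values.
    """
    while jet[1] is not None:
        j1, j2 = sorted((jet[1], jet[2]), key=lambda n: mass(n[0]), reverse=True)
        m_j = mass(jet[0])
        if m_j > 0.0:
            y = (min(p_t(j1[0]), p_t(j2[0])) ** 2
                 * delta_r(j1[0], j2[0]) ** 2 / m_j ** 2)
            if mass(j1[0]) < mu * m_j and y > ycut:
                return jet, j1, j2
        jet = j1            # discard the lighter subjet and iterate
    return None
```

In this sketch a jet built from two hard, well-separated particles plus a soft wide-angle one first fails the mass-drop test (the soft particle is discarded) and then passes it on the two hard prongs, whereas a hard particle with a single soft emission returns `None`.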

Once one has a Higgs candidate, one verifies that both
subjets have a $b$-tag. This, together with the symmetry
cut, helps eliminate much of the background. The mass
resolution, however, is not very good at this stage: by
triggering on the mass drop, one has an effective jet
radius ($R_{b\bar b}$) that corresponds closely to the ideal radius for
capturing all radiation from the decaying Higgs boson;
but one also captures much underlying event (UE), which
degrades the mass resolution. The next step is therefore
to further undo the clustering to an effective radius of
$\min(0.3, 0.5 R_{b\bar b})$ and take the 3 hardest resulting subjets:
the $b$, $\bar b$ and the hardest emitted gluon. This eliminates
much of the UE, while keeping most of the hard
perturbative radiation from the Higgs decay.
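The filtering step can be illustrated with a small Python sketch. It assumes that "undoing the clustering to an effective radius" can be modelled by reclustering the constituents of the Higgs candidate with a Cambridge/Aachen-style closest-pair recombination at radius $\min(0.3, 0.5 R_{b\bar b})$, and then keeping the three subjets with the largest $p_t$; the function names and the four-vector format `(E, px, py, pz)` are hypothetical.

```python
import math

def massless(pt, y, phi):
    """Four-vector (E, px, py, pz) of a massless particle."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))

def _delta_r(p, q):
    """Rapidity-azimuth distance between two four-vectors."""
    y_p = 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))
    y_q = 0.5 * math.log((q[0] + q[3]) / (q[0] - q[3]))
    dphi = abs(math.atan2(p[2], p[1]) - math.atan2(q[2], q[1]))
    return math.hypot(y_p - y_q, min(dphi, 2 * math.pi - dphi))

def filter_radius(r_bb):
    """Filtering radius quoted in the text: min(0.3, 0.5 * R_bb)."""
    return min(0.3, 0.5 * r_bb)

def invariant_mass(vectors):
    e, px, py, pz = (sum(c) for c in zip(*vectors))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def filter_jet(constituents, r_bb, n_keep=3):
    """Recluster at the filtering radius and keep the n_keep hardest subjets.

    Returns (kept_subjets, filtered_mass).
    """
    r_filt = filter_radius(r_bb)
    subjets = list(constituents)
    while len(subjets) > 1:
        d, i, j = min((_delta_r(subjets[i], subjets[j]), i, j)
                      for i in range(len(subjets))
                      for j in range(i + 1, len(subjets)))
        if d >= r_filt:
            break
        merged = tuple(a + b for a, b in zip(subjets[i], subjets[j]))
        subjets = [s for k, s in enumerate(subjets) if k not in (i, j)] + [merged]
    # Order by transverse momentum and keep the hardest ones
    subjets.sort(key=lambda p: math.hypot(p[1], p[2]), reverse=True)
    kept = subjets[:n_keep]
    return kept, invariant_mass(kept)
```

With two hard prongs, one moderately hard emission and a few soft, wide-angle particles as input, the three hard objects are retained and the filtered mass is lower than the mass of the full set of constituents, which is the intended removal of underlying-event contamination.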

The procedure is summarised in fig. 2 and illustrative
invariant-mass distributions at the different stages are
given in fig. 3, showing how the mass drop is the critical
stage for eliminating the background, while filtering is
crucial for obtaining good mass resolution on the signal.

To test the potential of a high-$p_t$ $VH$ analysis for
Higgs discovery, ref. [4] considered simulated $VH$ events
($V = W, Z$), and backgrounds from $Vj$ (including $Vb\bar b$),
$VV$, $t\bar t$, single-top and dijet events, generated with
Herwig [12] (and UE from Jimmy [13]). The precise results
depend on the choice of $R$ for the jet finder and
the $b$-tagging efficiency and fake rate. The main results



¹ "Significantly lighter" and "not too asymmetric" involve cuts that can
be chosen based on considerations of leading-order QCD emission. The
specific values of the cuts that were used are described in [4].
FIGURE 3. Invariant mass distribution of the hadronic part
of high-$p_t$ $ZH$ and $Zb\bar b$ events, for the hardest jet (light solid
line, using $R = 1.2$), after the mass drop (light dashed line)
and after filtering (dark solid line). The normalisation is
arbitrary (and unrelated) in the two plots. Events simulated with
Herwig 6.5 [12], Jimmy 4.3 [13] and reconstructed with
FastJet 2.3 [14].
FIGURE 4. Simulated invariant mass distributions in three
different channels, together with the combined results (bottom
right).



of [4] were for $p_{t,\min} = 200$ GeV, $R = 1.2$ and a $b$-tagging
efficiency/fake rate of 60%/2%. Here we complement this with results for
$p_{t,\min} = 300$ GeV, $R = 0.7$ and an efficiency/fake rate of 70%/1%, fig. 4,
where the number of events corresponds to 30 fb$^{-1}$,
without any $K$-factors. This shows a clear mass peak around
$m_H = 115$ GeV, together with a potentially very useful
control peak from $VZ$ events with $Z \to b\bar b$. The value
of signal/$\sqrt{\text{background}}$ is about 5.5 for the
combination of all leptonic channels, based on a mass window of
FIGURE 5. Dependence of the result for $S/\sqrt{B}$ on the $b$ mistag
rate (left) and on the Higgs-boson mass (right), for various
combinations of $p_{t,\min}$, $R$ and $b$-tagging efficiency.

TABLE 1. Approximate impact of different aspects of the
high-$p_t$ analysis on the signal and background in
$W/Z$-associated Higgs production, relative to the
low-$p_t$ analysis.

                             Signal    Background
Eliminate $t\bar t$, etc.    -         ×1/3
$p_t > 200$ GeV              ×1/20     ×1/60*
Improved acceptance          ×4        ×4
Twice better resolution      -         ×1/2
Add $Z \to \nu\bar\nu$       ×1.5      ×1.5
Total                        ×0.3      ×0.017

* for $Wb\bar b$ and $Zb\bar b$ backgrounds



16 GeV, which is roughly compatible with expected
experimental resolutions. The dependence of the result on
the $b$-tagging scenario and the Higgs mass is shown in
fig. 5, and one sees that the channel remains viable even
with worse $b$-tagging and for masses up to $\sim 130$ GeV.

One may ask how it is that by throwing out 95% of
the signal one can improve the chances of discovery
compared to the analysis of [3]. The answer involves
many aspects: for example the $t\bar t$ background is nearly
completely eliminated because it is difficult for $t\bar t$ events
to produce a collimated $b\bar b$ pair that recoils against a
high-$p_t$ $W$ boson; other backgrounds fall faster with $p_t$;
signal acceptance and mass resolution improve at high
$p_t$; and new signal channels arise ($Z \to \nu\bar\nu$). The impact
of each effect is summarised in table 1 (whose entries
are approximate because we have not fully repeated the
analysis of [3]). Additionally, the high-$p_t$ analysis is free
of cut- and top-induced artefacts, making it much easier
to claim discovery once one identifies a mass peak.
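The arithmetic behind the totals of table 1 can be checked with a short Python sketch. The individual factors are those listed in the table; the background factor for improved acceptance is taken to be ×4, which is an assumption made here because it is the value that reproduces the quoted total of ×0.017. The final line estimates the resulting change in $S/\sqrt{B}$ relative to the low-$p_t$ analysis, a derived figure that is not quoted in the text.

```python
import math
from fractions import Fraction

# Multiplicative factors on the signal, relative to the low-pt analysis
signal = {
    "pt > 200 GeV": Fraction(1, 20),
    "improved acceptance": Fraction(4),
    "add Z -> nu nu": Fraction(3, 2),
}

# Multiplicative factors on the background
background = {
    "eliminate ttbar, etc.": Fraction(1, 3),
    "pt > 200 GeV (Wbb, Zbb)": Fraction(1, 60),
    "improved acceptance": Fraction(4),      # assumed value, see lead-in
    "twice better resolution": Fraction(1, 2),
    "add Z -> nu nu": Fraction(3, 2),
}

signal_total = math.prod(signal.values())          # 3/10
background_total = math.prod(background.values())  # 1/60, i.e. about 0.017

# Relative change in S/sqrt(B) implied by the two totals
significance_gain = float(signal_total) / math.sqrt(float(background_total))

print(signal_total, background_total, round(significance_gain, 2))
# prints: 3/10 1/60 2.32
```

Under these assumptions, losing 70% of the signal is more than compensated by the much larger reduction of the background, giving roughly a factor of two improvement in $S/\sqrt{B}$.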

In considering the above results it is important to bear
in mind that they are based on hadron-level simulation.
Ultimately, this channel's degree of competitiveness for
Higgs discovery (notably compared to $gg \to H \to \gamma\gamma$)
will depend on the detailed detector performance (studies
are in progress), as well as possible future improvements
(e.g. separate consideration of $200 < p_{tV} < 300$ GeV
and $p_{tV} > 300$ GeV). Nevertheless we believe that our
choices at hadron level (e.g. the mass-window width) are
sufficiently conservative that there is a high likelihood
that this will be a fruitful channel for LHC Higgs studies.
This is important as it is the only channel that can provide
separate measurements of $WH$ and $ZH$ couplings, and
the control $Z$-peak will provide a constraining standard
candle for normalisation (especially once all diboson
production channels have been calculated to NNLO).

Finally, the analysis discussed here may have more
general lessons: one is that in searches dominated by
large backgrounds with cut-induced (or top-induced)
artefacts in the mass distribution, going to high $p_t$ can
help limit their impact. Another is that other studies
involving highly-boosted heavy objects ($W$, $Z$, $H$, $t$; see
e.g. [15]) stand to benefit significantly from the new
mass-drop and filtering jet techniques developed here, as
has already been seen in a related high-$p_t$ $t\bar t$ analysis [16].